APPENDIX A - Research Methodology and Analyses

Overview 

This section provides an overview of the study, including how the sample was selected, the outcome variables used, and the analysis steps.

Constructing the Survey Data File 

This section describes how the principal, teacher, and superintendent survey data were entered, cleaned, and recoded, and
describes the statistical reliability of the survey items.

Constructing Composite Independent Variables (Subdomains) 



This section describes the conceptual and technical development of subdomain composite variables from individual survey items that 
measure various schooling practice areas. 

Constructing Longitudinal Outcome Variables 

This section describes the use of the special longitudinal data file obtained from the California Department of Education (CDE) to 
develop longitudinal outcome variables that controlled for past student performance. 

Constructing Data Files for Analysis 

This section describes how the longitudinal and cross-sectional data were used in the analyses and lists the
control variables used in the study.

Specifying Predictor Pools 

This section describes the tools developed to effectively map the survey items into subdomains, and the subdomains into domains.



Regression Analyses 

This section describes the primary analytic technique, regression analysis, and lists the steps taken in the analysis.



Statistical Comparisons Across Study Domains 

This section describes the statistical methodology used in comparing domains. 
Overview 

Statistical analyses for the Middle Grades Study were carried out primarily by the Principal Data 
Analyst, Jesse Levin, who is a Senior Research Scientist at American Institutes for Research 
(AIR). Overall responsibility for planning and coordinating the analyses rested with the Senior
Technical Director, Edward Haertel, who is a professor in the School of Education at Stanford
University. Levin and Haertel were ably assisted by Ben Webman and other EdSource staff members. 
The project team met approximately twice per month from December 2008 through January 2010, with 
more frequent meetings as needed. 

This was a complicated study, using over 1,000 variables derived from three separate surveys (of
principals, teachers, and superintendents) to predict school-level outcomes on seven different 
California Standards Tests (CSTs). As described in one PowerPoint presentation, the study required 
analysis of over 1,000,000 teacher item responses, over 100,000 principal item responses, and nearly 
30,000 superintendent item responses. Over 400 distinct regression models were examined. 
Specification of all these analyses required over 6,300 lines of statistical programming. Over 20,000 
variables were created at various points in the process. The school CST score means serving as 
outcome variables were derived from the test scores of over 200,000 students. 

California public schools with both 7th and 8th grade students served as the primary sampling units 
and as the unit of analysis. These included both middle schools and K-8 elementary schools. The 
sample was further restricted to schools within two bands (the 20th-35th and 70th-85th percentiles) of 
the California Department of Education School Characteristics Index (SCI), a composite of 
demographic variables indicating the degree of educational challenge each school confronts. 1 Of the 
528 schools in this target sample, 133 were eliminated because their school districts declined to
participate (typically citing time pressures and uncertainties due to the current funding climate in
California) or because the school had closed or consolidated with another. Of the 395 schools contacted, 303
provided both teacher and principal data used in the study. Of these schools, 244 also had 
corresponding surveys that were completed by the superintendent presiding over their district or, in the 
case of charter schools, the chief administrator of the charter management organization. Within each 
participating school, all regular mathematics and/or English language arts (ELA) teachers of 6th, 7th, 
or 8th grade students were surveyed. 
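
The sample restriction and attrition described above can be illustrated with a short Python sketch. This is only an illustration, not the study's actual code: the function name `sci_band`, the `BANDS` table, and the treatment of band endpoints as inclusive are assumptions made here. The school counts are taken from the text, and the percentages are simple arithmetic on those counts.

```python
# Illustrative sketch only; names and endpoint handling are assumptions.
BANDS = {"lower": (20, 35), "upper": (70, 85)}  # SCI percentile bands from the text


def sci_band(percentile):
    """Return the band a school's SCI percentile falls in, or None.

    Treating both endpoints as inclusive is an assumption; the text does
    not say how boundary cases were handled.
    """
    for name, (low, high) in BANDS.items():
        if low <= percentile <= high:
            return name
    return None


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Counts reported in the text.
target_sample = 528
eliminated = 133                  # district declined, or school closed/consolidated
contacted = target_sample - eliminated
teacher_and_principal = 303       # schools with both teacher and principal data
with_superintendent = 244         # of those, schools with a superintendent survey

print(contacted)                                  # 395
print(pct(teacher_and_principal, contacted))      # 76.7
print(pct(with_superintendent, teacher_and_principal))  # 80.5
```

Under these assumptions, roughly three quarters of the contacted schools supplied both teacher and principal data, and about four fifths of those also had a matching superintendent survey.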

The main outcome variables used were school-level means of the CSTs in English Language Arts for 
grades 6 through 8 (ELA6, ELA7, and ELA8), Mathematics for grades 6 and 7, and for grade 8 
General Mathematics and Algebra I (Math6, Math7, Math8Gen, and Math8Alg). Analyses were 
based solely upon school-level data from students taking a CST without modifications. That is, no use 
was made of data from the California Modified Assessment (CMA), Standards-based Tests in Spanish 
(STS), or California Alternate Performance Assessment (CAPA). As described in the body of this 
report, schools were recruited from two demographic bands defined by the 2006-07 SCI. The 
combined set of all schools in both the 20th to 35th and 70th to 85th SCI percentile bands is referred to 
as the pooled sample. While most analyses used this pooled sample, some analyses used only the 
20th-35th band or the 70th-85th band schools. Only a subset of the schools served students in the 6th 



1 A report on the construction of the SCI is available at http://www.cde.ca.gov/ta/ac/ap/documents/tdgreport0400.pdf. Details of the
2007 SCI (used for sample selection) are available at http://www.cde.ca.gov/ta/ac/ap/documents/tdgreport0708.pdf.